(57) Abstract:

PROBLEM TO BE SOLVED: In a television conference system
based on ITU-T recommendations, the data to be
communicated are large, so holding a large-scale video
conference over a narrow-band LAN impairs real-time
performance and makes a flexible connection topology
difficult.

SOLUTION: Terminals 1 encode video and sound data in the
MPEG-4 and MP3 formats and transmit the resulting
multimedia data to a server device 3, which
stream-distributes the multimedia data to all the
terminals connected to the video conference system. The
terminals 1 decode the multimedia data received from the
server device 3 back into video and sound data. An H.320
terminal 31 and an H.323 terminal 33 can take part in the MPEG-4 video conference system
through conversion processors 33 and 35, which are provided with the function of an MPEG-4 terminal.
* NOTICES *

JPO and NCIPI are not responsible for any 
damages caused by the use of this translation. 

1. This document has been translated by computer, so the translation may not reflect the original
precisely.

2. **** shows a word which could not be translated.
3. Words in the drawings are not translated.



CLAIMS 



[Claim(s)] 

[Claim 1] A video conference system having server equipment which manages connection and
communication of the video conference system, and two or more terminals connected to the server
equipment concerned, characterized in that: each said terminal has an image input device which inputs
image data, an audio input unit which inputs voice data, and a transmitting processing section which
transmits said image data and voice data to said server equipment; said server equipment has a
connection Management Department which manages the terminals connected to the video conference
system, a speaker selection Management Department which chooses a speaker and sends out a selection
signal, and a message distribution processing section which carries out streaming distribution of all the
image data and voice data transmitted from said two or more terminals, together with said selection
signal, to all the terminals managed by said connection Management Department; and each said terminal
further has an image output unit which displays the image data distributed from said server equipment, a
speaker selection processing section which chooses the voice data corresponding to the speaker from the
selection signal and voice data distributed from said server equipment, and an audio output device which
outputs the voice data chosen by the speaker selection processing section concerned.

[Claim 2] A video conference system having server equipment which manages connection and
communication of the video conference system, and two or more terminals connected to the server
equipment concerned, characterized in that: each said terminal has an image input device which inputs
image data, an audio input unit which inputs voice data, an encoding processing section which encodes
said image data and voice data into image compressed data and speech compression data (henceforth
"multimedia data"), respectively, and a transmitting processing section which transmits the multimedia
data concerned to said server equipment; said server equipment has a connection Management
Department which manages the terminals connected to the video conference system, and a message
distribution processing section which carries out streaming distribution of all the multimedia data
transmitted from said two or more terminals to all the terminals managed by said connection
Management Department; and each said terminal further has a decoding section which decodes the
image compressed data and speech compression data distributed from said server equipment into image
data and voice data, respectively, an image output unit which outputs the image data decoded in the
decoding section, and an audio output device which outputs the voice data decoded in the decoding
section.

[Claim 3] The video conference system according to claim 2, characterized in that either said terminal
or said server equipment has a speaker selection Management Department which chooses a speaker and
sends out a selection signal together with the MP3 data, and said terminal has a speaker selection
processing section which chooses the MP3 data corresponding to the speaker based on said selection
signal and sends out only the MP3 data concerned to said decoding section.
[Claim 4] The video conference system according to claim 2, characterized in that either said terminal
or said server equipment has a speaker selection Management Department which chooses a speaker and
sends out a selection signal together with the MP3 data, and said server equipment has a speaker
selection processing section which chooses the MP3 data corresponding to the speaker based on said
selection signal and carries out streaming distribution of only the MP3 data concerned.
[Claim 5] The video conference system according to any one of claims 2 to 4, characterized in that said
image compressed data is MPEG-4 data and said speech compression data is MP3 data.
[Claim 6] A video conference system having server equipment which manages connection and
communication of the video conference system, the system comprising the server equipment, which
adopts a first coding method, a first terminal which adopts the first coding method, a second terminal
which adopts a second coding method, and transform-processing equipment which converts between the
first coding method and the second coding method, characterized in that: said first terminal has an image
input device which inputs image data, an audio input unit which inputs voice data, an encoding
processing section which encodes said image data and voice data into image compressed data and speech
compression data (henceforth "multimedia data"), respectively, a transmitting processing section which
transmits the multimedia data concerned to said server equipment, a decoding section which decodes the
multimedia data distributed from said server equipment into image data and voice data, respectively, an
image output unit which outputs the image data decoded in the decoding section, and an audio output
device which outputs the voice data decoded in the decoding section; said server equipment has a
connection Management Department which manages said first terminal and said transform-processing
equipment connected to the video conference system, and a message distribution processing section
which carries out streaming distribution of all the multimedia data transmitted from said two or more
terminals, and said selection signal, to all the terminals managed by said connection Management
Department; said second terminal has an image input port which inputs image data, a voice input port
which inputs voice data, an image output port which outputs image data, an audio output port which
outputs voice data, and an input/output processor which converts the image and voice data input and
output at said ports into the second coding method and transfers the data to and from said
transform-processing equipment; and said transform-processing equipment has a function which converts
the coding method of data transmitted from one of said server equipment and said second terminal and
transmits the data to the other, and has the functions of said first terminal and said second terminal.

[Claim 7] The video conference system according to claim 6, characterized in that any one of said first
terminal, said second terminal, and said server equipment has a speaker selection Management
Department which chooses a speaker and sends out a selection signal together with voice data or MP3
data, and said transform-processing equipment has a speaker selection processing section which chooses
the MP3 data corresponding to the speaker based on said selection signal and sends out only the MP3
data concerned to said decoding section.

[Claim 8] The video conference system according to claim 6, characterized in that any one of said first
terminal, said second terminal, and said server equipment has a speaker selection Management
Department which chooses a speaker and sends out a selection signal together with the MP3 data, and
said server equipment has a speaker selection processing section which chooses the MP3 data
corresponding to the speaker based on said selection signal and carries out streaming distribution of only
the MP3 data concerned.

[Claim 9] The video conference system according to any one of claims 6 to 8, characterized in that said
image compressed data is MPEG-4 data and said speech compression data is MP3 data.




DETAILED DESCRIPTION 



[Detailed Description of the Invention] 
[0001] 

[Field of the Invention] This invention relates to a video conference system which interconnects
remote sites to hold a television conference.

[0002] 

[Description of the Prior Art] In recent years, with the development of communication technology,
video conference systems which interconnect many points at remote sites to hold a television conference
have been developed. Such video conference systems have spread in a form based on ITU-T
recommendations. Specifically, a television conference is realized with terminals (H.320 terminals and
H.323 terminals) based on H.320 (which specifies video conferencing terminals for ISDN and similar
networks) or H.323 (which specifies video conferencing terminals for IP networks and similar
networks), both recommended by the ITU-T as international standards for video conference systems.
These terminals transmit and receive images and voice over ISDN, IP networks, and the like by means
of call control (H.225), communications control (H.245), and so on.

[0003] To realize a conventional video conference system which holds a television conference among
many points, a multi-point control unit (MCU: Multi-Point Control Unit) which manages the connections
between the terminals at those points needs to be installed. The MCU manages the terminals connected
to the video conference system and notifies each terminal of the information on all the terminals
connected to the MCU, so that each terminal learns the destinations of image data and voice data and can
transmit and receive images and voice.

[0004] In recent years, in response to the demand to extend the topology of video conference systems,
a television conference can also be held between an H.320 terminal and an H.323 terminal by
installing, between an ISDN network and a LAN, a gateway which carries an ISDN interface and a LAN
interface and converts between the respective protocols.

[0005] As an example of such a conventional video conference system, a multi-point video conference
system between an H.320 terminal connected to an ISDN network and H.323 terminals connected to a
LAN is explained with reference to drawing 9.

[0006] The H.323 terminals 101 are connected to the LAN, and the H.320 terminal 102 is connected to
ISDN and, through the gateway unit 103, to the LAN. The MCU equipment 104, which manages the
connections of the multi-point terminals taking part in the television conference, is also connected to
the LAN.

[0007] To participate in the video conference system, an H.323 terminal 101 sends a connection request
containing the information on the local terminal to the MCU equipment 104 via the LAN. If the MCU
equipment 104 permits the connection of the H.323 terminal 101, it registers the information on the
terminal concerned and notifies all the currently connected terminals that the H.323 terminal 101 has
been connected, together with the information on the terminal concerned. By this processing, the H.323
terminal 101 becomes able to take part in the television conference.
[0008] The H.320 terminal 102 also makes a connection request containing the information on the local
terminal, as the H.323 terminal does. The H.320 terminal 102, however, transmits the request to the
gateway unit 103 in H.320 format; the gateway unit 103 converts the request concerned into H.323
format and makes the connection request to the MCU equipment 104. Connection authorization
processing follows the reverse route, and, as with the H.323 terminal 101, all the currently connected
terminals are notified that the H.320 terminal 102 has been connected, together with the information on
the terminal concerned. By this processing, the H.320 terminal 102 also becomes able to take part in the
television conference.

[0009] A video conference system connected in this way takes a form based on ITU-T recommendations,
and the television conference is held by transmitting and receiving H.323 data.

[0010] 

[Problem(s) to be Solved by the Invention] In such a video conference system, because the television
conference is held in a form based on ITU-T recommendations, the gateway unit 103 and the MCU
equipment 104 based on the ITU-T recommendations concerned had to be used, and a television
conference could not be held without equipment based on these recommendations.
[0011] Moreover, because conformance to the ITU-T recommendations requires that the prescribed
topology, procedures, data formats, and so on be followed (****), the topology is restricted. Because the
data to be communicated also become large, holding a large-scale television conference over a
narrow-band LAN impairs real-time performance, and it was difficult to build an efficient video
conference system.
[0012] MPEG-4, MP3, and the like exist as means of high data compression, and their compression
ratios are far higher than those of the data formats used in conventional video conference systems. If a
television conference could be held with MPEG-4 or MP3 data (******), a real-time television
conference would be possible even over a narrow-band LAN and a more flexible topology would
become possible, so realization of such a video conference system is desired.
[0013] 

[Means for Solving the Problem] To solve the above-mentioned technical problem, this invention
provides a video conference system having server equipment which manages connection and
communication of the video conference system, and two or more terminals connected to the server
equipment concerned, configured as follows. Each said terminal has an image input device which inputs
image data, an audio input unit which inputs voice data, and a transmitting processing section which
transmits said image data and voice data to said server equipment. Said server equipment has a
connection Management Department which manages the terminals connected to the video conference
system, a speaker selection Management Department which chooses a speaker and sends out a selection
signal, and a message distribution processing section which carries out streaming distribution of all the
image data and voice data transmitted from said two or more terminals, together with said selection
signal, to all the terminals managed by said connection Management Department. Each said terminal
further has an image output unit which displays the image data distributed from said server equipment, a
speaker selection processing section which chooses the voice data corresponding to the speaker from the
selection signal and voice data distributed from said server equipment, and an audio output device which
outputs the voice data chosen by the speaker selection processing section concerned.

[0014] Moreover, in a video conference system having server equipment which manages connection and
communication of the video conference system, and two or more terminals connected to the server
equipment concerned, each said terminal has an image input device which inputs image data, an audio
input unit which inputs voice data, an encoding processing section which encodes said image data and
voice data into image compressed data and speech compression data (henceforth "multimedia data"),
respectively, and a transmitting processing section which transmits the multimedia data concerned to
said server equipment. Said server equipment has a connection Management Department which manages
the terminals connected to the video conference system, and a message distribution processing section
which carries out streaming distribution of all the multimedia data transmitted from said two or more
terminals to all the terminals managed by said connection Management Department. Each said terminal
further has a decoding section which decodes the image compressed data and speech compression data
distributed from said server equipment into image data and voice data, respectively, an image output unit
which outputs the image data decoded in the decoding section, and an audio output device which outputs
the voice data decoded in the decoding section.

[0015] In this case, it is desirable that either said terminal or said server equipment has a speaker
selection Management Department which chooses a speaker and sends out a selection signal together
with the MP3 data, and that said terminal has a speaker selection processing section which chooses the
MP3 data corresponding to the speaker based on said selection signal and sends out only the MP3 data
concerned to said decoding section.

[0016] Alternatively, it is desirable that either said terminal or said server equipment has a speaker
selection Management Department which chooses a speaker and sends out a selection signal together
with the MP3 data, and that said server equipment has a speaker selection processing section which
chooses the MP3 data corresponding to the speaker based on said selection signal and carries out
streaming distribution of only the MP3 data concerned.

[0017] Furthermore, it is desirable that said image compressed data is MPEG-4 data and said speech
compression data is MP3 data.

[0018] Moreover, a video conference system having server equipment which manages connection and
communication of the video conference system comprises the server equipment, which adopts a first
coding method, a first terminal which adopts the first coding method, a second terminal which adopts a
second coding method, and transform-processing equipment which converts between the first coding
method and the second coding method. Said first terminal has an image input device which inputs image
data, an audio input unit which inputs voice data, an encoding processing section which encodes said
image data and voice data into image compressed data and speech compression data (henceforth
"multimedia data"), respectively, a transmitting processing section which transmits the multimedia data
concerned to said server equipment, a decoding section which decodes the multimedia data distributed
from said server equipment into image data and voice data, respectively, an image output unit which
outputs the image data decoded in the decoding section, and an audio output device which outputs the
voice data decoded in the decoding section. Said server equipment has a connection Management
Department which manages said first terminal and said transform-processing equipment connected to the
video conference system, and a message distribution processing section which carries out streaming
distribution of all the multimedia data transmitted from said two or more terminals, and said selection
signal, to all the terminals managed by said connection Management Department. Said second terminal
has an image input port which inputs image data, a voice input port which inputs voice data, an image
output port which outputs image data, an audio output port which outputs voice data, and an
input/output processor which converts the image and voice data input and output at said ports into the
second coding method and transfers the data to and from said transform-processing equipment. Said
transform-processing equipment has a function which converts the coding method of data transmitted
from one of said server equipment and said second terminal and transmits the data to the other.

[0019] In this case, it is desirable that any one of said first terminal, said second terminal, and said
server equipment has a speaker selection Management Department which chooses a speaker and sends
out a selection signal together with voice data or MP3 data, and that said transform-processing
equipment has a speaker selection processing section which chooses the MP3 data corresponding to the
speaker based on said selection signal and sends out only the MP3 data concerned to said decoding
section.
[0020] Alternatively, any one of said first terminal, said second terminal, and said server equipment has
a speaker selection Management Department which chooses a speaker and sends out a selection signal
together with the MP3 data, and said server equipment has a speaker selection processing section which
chooses the MP3 data corresponding to the speaker based on said selection signal and carries out
streaming distribution of only the MP3 data concerned.
[0021] Moreover, it is desirable that said image compressed data is MPEG-4 data and said speech
compression data is MP3 data.

[0022] 

[Embodiment of the Invention] Next, a first embodiment of this invention is explained with reference to
the drawings. Drawing 1 shows an example of the topology of the video conference system (MPEG-4
system) according to this invention. The terminals 1 and 2, which are MPEG-4 terminals, and the server
equipment 3, which performs connection and communications control of the video conference system,
are connected to a LAN. In practice more MPEG-4 terminals are connected to the LAN, but for
convenience only two MPEG-4 terminals are explained here as an example.

[0023] As illustrated in drawing 2, each of the terminals 1 and 2 consists of the following: the image
input device 4, which captures images from a camera or the like; the encoding (coding) processing
section 5, which compresses the images captured by the image input device 4 into MPEG-4; the MPEG-4
data transmitting processing section 6, which transmits the MPEG-4 data encoded in the encoding
processing section 5 to the server equipment 3 through the LAN; the audio input unit 7, which captures
voice from a microphone or the like; the encoding (coding) processing section 8, which compresses the
voice captured by the audio input unit 7 into MP3; the MP3 data transmitting processing section 9, which
transmits the MP3 data encoded in the encoding processing section 8 to the server equipment 3 through
the LAN; the MPEG-4 data reception section 13, which receives the MPEG-4 data sent through the LAN
from the server equipment 3 and passes it to the next processing section; the decoding section 12, which
decodes the MPEG-4 data sent from the MPEG-4 data reception section 13 into image data; the image
output unit 11, which displays the images decoded in the decoding section 12 on a display or the like;
the MP3 data reception section 17, which receives the MP3 data and the selection signal designating the
speaker of the television conference, both sent through the LAN from the server equipment 3, and passes
them to the next processing section; the speaker selection processing section 16, which chooses the MP3
data from the speaker's terminal based on the selection signal and passes it to the next processing
section; the decoding section 15, which decodes the MP3 data chosen in the speaker selection processing
section 16 into voice data; the audio output device 14, which outputs the voice decoded in the decoding
section 15 to a loudspeaker or the like; the connection Management Department 18, which performs
connection processing to the video conference system; the control unit 19, which operates the
connection Management Department 18 and the other sections; and the LAN interface 10, which
interfaces the MPEG-4 data and MP3 data (called "multimedia data" below) with the LAN.

[0024] Here, MPEG-4 is a low-bit-rate coding method for images and voice aimed at communication
over the Internet and the like. It differs from MPEG-2 in its compression approach and in the means of
communications it targets, and is characterized by a compression ratio about 10 times higher than that of
MPEG-2. MP3 (MPEG-1 Audio Layer 3) is the audio compression technology of MPEG-1, which is an
image compression technology, and is at present the speech compression means with the highest
compression ratio.

[0025] In this example, images and voice are transmitted and received separately as MPEG-4 data and
MP3 data. As explained later, this is because all the image data is displayed, whereas only the voice data
from the speaker is chosen and output from the audio output device 14 to a loudspeaker or the like; the
data is therefore divided, encoded, and ******.
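
The split between "display every image" and "play only the speaker's voice" can be illustrated with a minimal sketch. It assumes that each received MP3 packet carries the IP address of the terminal that produced it and that the selection signal is simply the speaker's IP address; the specification does not define either format, and the names AudioPacket and select_speaker_audio are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioPacket:
    # Hypothetical packet layout: the specification does not define one.
    source_ip: str   # terminal that captured and encoded the voice
    payload: bytes   # encoded MP3 data


def select_speaker_audio(packets, selected_ip):
    """Keep only the MP3 packets from the terminal named by the selection signal.

    This stands in for the speaker selection processing section 16; the
    packets it returns would be handed on to the decoding section 15.
    """
    return [p for p in packets if p.source_ip == selected_ip]


# Packets from two terminals arrive interleaved; only the speaker's are kept.
packets = [
    AudioPacket("192.0.2.1", b"a1"),
    AudioPacket("192.0.2.2", b"b1"),
    AudioPacket("192.0.2.1", b"a2"),
]
chosen = select_speaker_audio(packets, "192.0.2.1")
```

Filtering at the terminal, as here, corresponds to the arrangement of claim 3; applying the same filter in the server equipment before distribution would correspond to claim 4.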

[0026] As shown in drawing 3, the server equipment 3 consists of the following: the LAN interface 20,
which interfaces with the LAN; the MPEG-4 reception section 21, which receives the MPEG-4 data in
the multimedia data; the MPEG-4 data message distribution processing section 22, which distributes the
MPEG-4 data received in the MPEG-4 reception section 21 to all the terminals connected to the video
conference system; the MP3 reception section 23, which receives the MP3 data in the multimedia data;
the MP3 data message distribution processing section 24, which distributes the MP3 data received in the
MP3 reception section 23 to all the terminals connected to the video conference system; the control unit
29, with which, while the television conference is in session, the terminal whose voice is to be output is
chosen on the screen (not shown) of the server equipment 3; and the speaker selection Management
Department 30, which distributes the selection signal specifying the terminal chosen with the control
unit 29 to all terminals.
[0027] The connection Management Department 25 consists of the following: the connection terminal IP
address Management Department 26, which manages the IP addresses of the terminals connected to the
video conference system; the access permit list 27, which holds identification information, such as IP
addresses, of the terminals which may participate in the video conference system; and the data setting
Management Department 28, in which communication information such as the bit rate and frame rate
permitted for communication with the video conference system (henceforth "communication link
information") is set, and which manages the communication link information set for every terminal.
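
The distribution role of the server equipment 3 can be sketched as a simple fan-out: whatever one terminal sends is forwarded unchanged to every registered terminal. This is a minimal sketch under stated assumptions, not the implementation in the specification. The class DistributionServer and its methods are hypothetical, an in-memory list stands in for each terminal's network connection, and "all the terminals" is read literally, so the sender also receives its own data.

```python
class DistributionServer:
    """Toy stand-in for the message distribution processing sections 22 and 24."""

    def __init__(self):
        # Maps a terminal's IP address to its outbox. A real server would hold
        # a network connection here; a list keeps the sketch self-contained.
        self.registered = {}

    def register(self, ip):
        """Record a terminal that has been admitted to the conference."""
        self.registered[ip] = []

    def distribute(self, sender_ip, data):
        """Forward data unchanged to every registered terminal."""
        for outbox in self.registered.values():
            outbox.append((sender_ip, data))


server = DistributionServer()
server.register("192.0.2.1")
server.register("192.0.2.2")
server.distribute("192.0.2.1", b"mpeg4-frame")
```

Because the server only relays already-encoded data, it does no decoding or mixing; the cost of the scheme is that each terminal receives one stream per participant.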

[0028] In this example the speaker is chosen with the control unit 29 of the server equipment 3, but any
other generally known selection approach may be used, and speaker selection can also be performed by
the same approach at other terminals.
[0029] Next, as the operation of this example, the case where the terminal 1 and the terminal 2 hold a
television conference is explained with reference to drawing 4.

[0030] To participate in the television conference, the connection Management Department 18 of the
terminal 1 sends a connection-request signal to the server equipment 3. When the server equipment 3
receives the connection-request signal from the terminal 1, the connection Management Department 25
checks the limit on the number of connected terminals, whether participation in the television
conference is authorized, and the multimedia data settings, using the connection terminal IP address
Management Department 26, the access permit list 27, and the data setting Management Department 28,
respectively.

[0031] First, the limit on the number of connected terminals is checked. Because the connection
terminal IP address Management Department 26 manages the IP addresses of the terminals currently
connected, the number of connected terminals is compared with the number of terminals that can be
managed. If the number of connected terminals has reached that limit, a connection disapproval signal is
sent to the terminal 1 and the connection request is canceled. If the limit has not been reached, whether
participation in the television conference is authorized is judged next.

[0032] Because the access permit list 27 holds, as a list, the IP addresses of the terminals which may
participate in the video conference system, it is judged whether the IP address of the terminal 1 is
included in the list concerned. If it is not included, participation in the television conference is judged
not to be permitted, a connection disapproval signal is sent to the terminal 1, and the connection request
is canceled. If it is included in the list concerned, the multimedia data settings are checked next.
[0033] Because the permitted communication link information is set in the data setting Management
Department 28, the data setting Management Department 28 judges, from the communication link
information of the terminal 1 contained in the connection-request signal, whether the request is within
the permitted communication capability. If the request exceeds the communication capability, the data
setting Management Department 28 sends a connection disapproval signal to the terminal 1 and cancels
the connection request. Only if the request does not exceed the communication capability is a connection
enabling signal transmitted to the terminal 1, and the IP address and communication link information of
the terminal 1 are recorded in the connection Management Department 25.
[0034] By receiving the connection enabling signal from the server equipment 3 at the connection
Management Department 18, the terminal 1 recognizes that it has been connected to the video conference
system, and can thereby participate in the video conference system.
[0035] To participate in the television conference, the terminal 2 performs the same connection
processing as the terminal 1.
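
The three admission checks of paragraphs [0030] to [0033] run in a fixed order: terminal limit, access permit list, then communication capability. The following is a minimal sketch of that order, assuming the communication link information consists of just a bit rate and a frame rate. The class name ConnectionManager, the result strings, and all numeric limits are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field

PERMITTED = "permitted"
REJECT_LIMIT = "rejected: terminal limit reached"
REJECT_LIST = "rejected: not on access permit list"
REJECT_CAPABILITY = "rejected: exceeds communication capability"


@dataclass
class ConnectionManager:
    """Toy stand-in for the connection Management Department 25."""
    max_terminals: int       # number of terminals that can be managed
    access_permit_list: set  # IP addresses allowed to participate
    max_bit_rate: int        # permitted bit rate, bits per second (assumed unit)
    max_frame_rate: int      # permitted frame rate, frames per second
    connected: dict = field(default_factory=dict)

    def request(self, ip, bit_rate, frame_rate):
        # Check 1: limit on the number of connected terminals ([0031]).
        if len(self.connected) >= self.max_terminals:
            return REJECT_LIMIT
        # Check 2: access permit list ([0032]).
        if ip not in self.access_permit_list:
            return REJECT_LIST
        # Check 3: requested settings against permitted capability ([0033]).
        if bit_rate > self.max_bit_rate or frame_rate > self.max_frame_rate:
            return REJECT_CAPABILITY
        # All checks passed: record the terminal and permit the connection.
        self.connected[ip] = (bit_rate, frame_rate)
        return PERMITTED


manager = ConnectionManager(
    max_terminals=2,
    access_permit_list={"192.0.2.1", "192.0.2.2", "192.0.2.3"},
    max_bit_rate=384_000,
    max_frame_rate=15,
)
results = [
    manager.request("192.0.2.1", 128_000, 10),     # passes all three checks
    manager.request("198.51.100.7", 128_000, 10),  # not on the permit list
    manager.request("192.0.2.2", 512_000, 10),     # bit rate too high
    manager.request("192.0.2.2", 128_000, 10),     # passes; limit now reached
    manager.request("192.0.2.3", 128_000, 10),     # terminal limit reached
]
```

A terminal is recorded only after the last check succeeds, which matches the statement in [0033] that the IP address and communication link information are registered when the connection enabling signal is sent.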

[0036] Next, the operation during a television conference session is explained.

[0037] The terminal 1, having obtained connection authorization, captures image data from a camera or
the like with the image input device 4 and captures voice data from a microphone or the like with the
audio input unit 7. On receiving the image data captured by the image input device 4, the encoding
processing section 5 encodes the image data concerned into MPEG-4 data with MPEG-4 codec software
or the like and passes it to the MPEG-4 data transmitting processing section 6.

[0038] On the other hand, on receiving the voice data captured by the audio input unit 7, the encoding
processing section 8 encodes the voice data into MP3 with MP3 codec software etc. and passes it to the
MP3 data transmitting processing section 9.
[0039] In addition, the encoding processing sections 5 and 8 each monitor the encoding progress of the
other, and the side with less delay holds back its encoding to match the side with more delay, so that
the data is passed to the MPEG-4 data transmitting processing section 6 and the MP3 data transmitting
processing section 9 with the two streams synchronized.
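
The synchronization rule above (the faster side waits for the slower side) can be illustrated with a short Python sketch. The function name and the idea of expressing each encoder's delay as a single number in milliseconds are assumptions made for illustration; the specification does not state how the delay is measured.

```python
from typing import Tuple


def extra_waits(video_delay_ms: int, audio_delay_ms: int) -> Tuple[int, int]:
    """Return the additional wait (video, audio) so both streams leave together.

    The stream with the larger encoding delay waits 0 ms; the other stream
    is held back by the difference.
    """
    slowest = max(video_delay_ms, audio_delay_ms)
    return slowest - video_delay_ms, slowest - audio_delay_ms
```

For example, if video encoding takes 40 ms and audio encoding takes 15 ms, the audio side is held back by 25 ms and the video side is not delayed.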

[0040] The MPEG-4 data transmitting processing section 6 and the MP3 data transmitting processing
section 9 each set server equipment 3 as the destination and transmit the multimedia data to server
equipment 3 through the LAN.

[0041] When server equipment 3 receives multimedia data at the LAN interface 20, the MPEG-4 data in
the multimedia data is sent to the MPEG-4 data reception section 21 and the MP3 data is sent to the
MP3 data reception section 23. The multimedia data is passed unchanged to the MPEG-4 data message
distribution processing section 22 and the MP3 data message distribution processing section 24. Each
of these sections sets the IP addresses of the terminals registered in the connection terminal IP
address Management Department 26 of the connection Management Department 25 as the destinations of
the received multimedia data, and performs multicast (streaming) distribution onto the LAN through
the LAN interface 20. In this way, server equipment 3 streams the multimedia data transmitted from
every terminal connected to the video conference system onto the LAN in its original format, without
processing the received multimedia data, and the real-time performance of server equipment 3 is
thereby markedly improved.
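
The pass-through distribution described above can be sketched as follows. This is a simplified Python illustration, not the disclosed implementation: `distribute` and its `send` callback are hypothetical names, and the callback stands in for whatever network transmission the LAN interface 20 performs.

```python
from typing import Callable, Iterable


def distribute(packet: bytes,
               registered_ips: Iterable[str],
               send: Callable[[str, bytes], None]) -> int:
    """Forward one received packet, unmodified, to every registered terminal.

    `send` is a hypothetical transport callback (for example, a wrapper
    around a socket). Returns the number of destinations addressed.
    """
    count = 0
    for ip in registered_ips:
        send(ip, packet)  # payload passed through as received: no decode, no re-encode
        count += 1
    return count
```

Because the server never decodes or re-encodes the payload, its per-packet work in this sketch is limited to one send per registered terminal.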
[0042] Moreover, while the television conference is being held, at server equipment 3 the speaker is
identified on a display (not shown) and speaker selection is made with the control unit 29. The
speaker selection Management Department 30 looks up the terminal corresponding to the speaker chosen
with the control unit 29, and a selection signal specifying the speaker is streamed together with the
MP3 data from the MP3 data message distribution processing section 24.
[0043] When multimedia data is streamed from server equipment 3, terminals 1 and 2 connected to the
video conference system judge at the LAN interface 10 whether the data is addressed to the local
terminal, and if it is, they take in the multimedia data.

[0044] Of the multimedia data taken in, the MPEG-4 data is received by the MPEG-4 data reception
section 13, and the MP3 data and the selection signal are received by the MP3 data reception
section 17. On receiving the MPEG-4 data from the MPEG-4 data reception section 13, the decoding
section 12 decodes it into image data with MPEG-4 codec software etc. and passes it to the image
output unit 11. At this time, an image output-control processing section (not shown), which handles
the display screen for television conferences, processes the image data from each terminal so that it
is displayed according to the television conference display settings configured at the local
terminal. That is, in this example the image data (MPEG-4 data) sent from each terminal is
distributed without being processed in any way, and it is only at the local terminal that it is
arranged in the display format for television conferences set up beforehand. Examples of the display
include dividing the screen equally among the terminals, displaying only the speaker, or enlarging
the speaker and dividing the remaining area equally among the others.
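
As an illustration of the "equal division" display example, the following Python sketch computes one tile per terminal on a grid. The near-square grid, the function name `equal_tiles`, and the (x, y, width, height) tuple layout are assumptions; the specification only states that the screen may be divided equally.

```python
import math
from typing import List, Tuple


def equal_tiles(n: int, width: int, height: int) -> List[Tuple[int, int, int, int]]:
    """Return n tiles as (x, y, w, h), laid out row by row on a near-square grid."""
    if n <= 0:
        return []
    cols = math.ceil(math.sqrt(n))   # assumed layout: as close to square as possible
    rows = math.ceil(n / cols)
    tile_w, tile_h = width // cols, height // rows
    return [((i % cols) * tile_w, (i // cols) * tile_h, tile_w, tile_h)
            for i in range(n)]
```

With four terminals on a 640 x 480 screen, each terminal receives a 320 x 240 tile.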

[0045] On the other hand, the MP3 data reception section 17 passes the received MP3 data and
selection signal to the speaker selection processing section 16, which uses the selection signal to
choose only the MP3 data from the terminal corresponding to the speaker and passes it to the decoding
section 15. On receiving the MP3 data chosen by the speaker selection processing section 16, the
decoding section 15 decodes it into voice data with MP3 codec software etc. and passes it to the
audio output device 14. If all of the MP3 data were passed to the decoding section 15 and decoded,
the processing load would reduce the real-time performance of the television conference, so only the
necessary minimum, namely the MP3 data corresponding to the speaker, is decoded.
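
The speaker selection step can be sketched in Python as below. The representation of each received frame as a (source IP, payload) pair, the use of an IP address as the selection signal, and the `decode` callback are assumptions made for illustration; the specification does not define the form of the selection signal.

```python
from typing import Callable, Iterable, List, Tuple


def decode_selected(frames: Iterable[Tuple[str, bytes]],
                    selected_ip: str,
                    decode: Callable[[bytes], bytes]) -> List[bytes]:
    """Decode only the frames that came from the selected speaker's terminal.

    Frames from other terminals are dropped before decoding, so the
    decoder is never invoked for them.
    """
    decoded = []
    for source_ip, payload in frames:
        if source_ip == selected_ip:
            decoded.append(decode(payload))
    return decoded
```

The saving comes from filtering before decoding: the number of decoder calls depends on the selected speaker's frames only, not on the number of participants.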

[0046] In addition, the decoding sections 12 and 15 each monitor the decoding progress of the other,
and the side with less delay holds back its decoding to match the side with more delay, so that the
data is passed to the image output unit 11 and the audio output device 14 with the two streams
synchronized.

[0047] The decoded image data is then displayed on a display etc. by the image output unit 11, and
the decoded voice data is output to a loudspeaker etc. by the audio output device 14.

[0048] By performing such processing, a television conference can be held between terminals 1 and 2.

[0049] Next, the second embodiment of this invention is explained. Whereas the first embodiment
described a video conference system made up of MPEG-4 terminals, a video conference system that also
includes an H.323 terminal is explained here.

[0050] Drawing 5 shows the topology of the video conference system in the second embodiment. Server
equipment 3, terminal 31 which is an MPEG-4 terminal, H.323 terminal 32 conforming to H.323, and
transform-processing equipment 33 which converts between MPEG-4 format and H.323 format are connected
to this video conference system by a LAN.

[0051] Since the configurations of server equipment 3 and terminal 31 are the same as those of server
equipment 3 and terminals 1 and 2 explained in the first embodiment, their explanation is omitted.

[0052] As shown in drawing 7, H.323 terminal 32 is equipped with an image input port 41 that inputs
an image from a video source etc., a voice input port 42 that inputs voice from a microphone etc., an
image output port 43 that displays an image on a display etc., an audio output port 44 that outputs
voice to a loudspeaker etc., and an image voice-input/output processor 45 that transmits and receives
H.323 format data between these ports and the LAN.

[0053] In this embodiment, transform-processing equipment 33 is shown as a combination of MPEG-4
terminal 47 and H.323 terminal 46, as shown in drawing 6, but any equipment that has the functions of
MPEG-4 terminal 47 and H.323 terminal 46 described below is sufficient.

[0054] In transform-processing equipment 33, the image output port 36 and audio output port 37 of
H.323 terminal 46 are connected to the image input unit 4 and audio input unit 7 of MPEG-4
terminal 47, respectively, and the image output unit 11 and audio output device 14 of MPEG-4
terminal 47 are connected to the image input port 34 and voice input port 35 of H.323 terminal 46,
respectively. Since the configurations of the H.323 terminal and the MPEG-4 terminal are the same as
described above, their explanation is omitted here.

[0055] Next, as the operation of this example, the case where H.323 terminal 32 and terminal 31 hold
a television conference is explained with reference to drawing 8.

[0056] First, so that terminal 31 and transform-processing equipment 33 (its MPEG-4 terminal part)
can participate in the television conference, each issues a connection request from its connection
Management Department 18 to server equipment 3, and terminal 31 and transform-processing
equipment 33 are connected to the video conference system by the same processing as in the first
embodiment. Transform-processing equipment 33 thereby becomes able to transmit multimedia data to
server equipment 3 and to receive the distribution from server equipment 3.

[0057] H.323 terminal 32 is then, in effect, connected to the video conference system by performing
connection processing with transform-processing equipment 33, which is already connected to the video
conference system.

[0058] Next, the operation during a television conference session is explained with reference to
drawing 8. Drawing 8 is a flow chart showing the operation of transform-processing equipment 33.
[0059] When H.323 terminal 32 captures image data from a camera etc. with the image input/output 36
and voice data from a microphone etc. with the voice input/output equipment 37, the image
voice-input/output processor 45 encodes the captured data into H.323 format, which is digital data,
and transmits the H.323 data to transform-processing equipment 33 through the LAN.

[0060] Transform-processing equipment 33 converts the H.323 data input through the LAN into
multimedia data and transmits the multimedia data to the server. Specifically, when the input/output
processor 40 receives H.323 data from H.323 terminal 32 through the LAN, it converts the H.323 data
into composite video and analog audio (hereinafter "analog data") and outputs the analog data from
the image output port 43 and audio output port 44 to the image input unit 4 and audio input unit 7.
The image output port 36 and image input unit 4, and the audio output port 37 and audio input unit 7,
are the camera, microphone, display, and loudspeaker connectors provided on conventional H.323
terminals and MPEG-4 terminals, and they are connected to each other. The same applies to the image
output unit 11 and image input port 34, and to the audio output device 14 and voice input port 35,
which are explained later.
[0061] At the MPEG-4 terminal of transform-processing equipment 33, when the image input unit 4 and
audio input unit 7 capture the analog data, the analog data is sent to the encoding processing
sections 5 and 8. The encoding processing sections 5 and 8 encode the respective analog data into
MPEG-4 and MP3 and transmit it to server equipment 3 through the LAN. These transmission processes
and the distribution processing in server equipment 3 are the same as explained in the first
embodiment.

[0062] When multimedia data is distributed from server equipment 3, transform-processing
equipment 33 receives the MPEG-4 data at the MPEG-4 data reception section 13 of its own MPEG-4
terminal, and receives the MP3 data and the selection signal at the MP3 data reception section 17.
[0063] On receiving MPEG-4 data from the MPEG-4 data reception section 13, the decoding section 12
decodes it into a composite video signal, and on receiving the MP3 data chosen by the speaker
selection processing section, the decoding section 15 decodes it into analog voice. The results are
sent to H.323 terminal 46 from the image output unit 11 and the audio output device 14, respectively.

[0064] When the analog data is received at the image input port 34 and voice input port 35, the image
voice-input/output processor 40 of H.323 terminal 46 encodes the analog data into H.323 data and
transmits it to H.323 terminal 32.

[0065] The image voice-input/output processor 45 of H.323 terminal 32 captures the data received
from the image input port 41 and voice input port 42, displays the image on a display from the image
output port 43, and outputs the voice to a loudspeaker from the audio output port 44.

[0066] By performing such processing, a television conference can be held between terminals that use
different protocols, without either side being aware of the difference in protocol.
[0067] 

[Effect of the Invention] As explained above, according to this invention, server equipment manages
the information on the terminals connected to the video conference system, each terminal need only
transmit a multimedia signal to server equipment, and server equipment distributes the multimedia
signal to all the connected terminals, so the load on each terminal can be reduced.

[0068] Moreover, since high-compression coding methods such as MPEG-4 and MP3 can be handled, even on
a narrow-band LAN the network becomes congested neither by transmission from each terminal nor by
distribution from server equipment, and a video conference system with high real-time performance
can be held.

[0069] Furthermore, because image data and voice data are encoded with separate coding methods, the
voice data corresponding to a speaker can be selected independently of the image data, which
simplifies the equipment for speaker selection.

[0070] Moreover, a video conference system with higher real-time performance can be held by decoding
only the encoded voice data corresponding to the speaker.

[0071] Furthermore, even when a television conference is held between terminals of different
specifications, the video conference system can be built simply by interposing transform-processing
equipment that uses a general-purpose MPEG-4 terminal, an H.323 terminal, etc. as they are, without
installing an expensive dedicated terminal, so the video conference system can be constituted more
flexibly and cheaply.


TECHNICAL FIELD 



[Field of the Invention] This invention relates to a video conference system that interconnects
remote sites to hold a television conference.

PRIOR ART



[Description of the Prior Art] In recent years, with the development of communication technology,
video conference systems that interconnect many remote points to hold a television conference have
been developed. Such video conference systems have spread in a form based on ITU-T recommendations.
Specifically, images and voice are transmitted and received between terminals on ISDN, IP networks,
and other networks by call control (H.225), communications control (H.245), etc., and television
conferences have been realized with terminals (H.320 terminals, H.323 terminals) conforming to H.320
(which specifies video conferencing terminals in general for ISDN etc.) or H.323 (which specifies
video conferencing terminals for IP networks etc.), both recommended as international standards for
video conference systems.

[0003] To realize a conventional video conference system that holds a television conference among
such multiple points, a multi-point control unit (MCU: Multi-Point Control Unit) that manages the
connections between the terminals at the multiple points must be installed. The MCU manages the
terminals connected to the video conference system and notifies each terminal of the information on
all the terminals connected to the MCU, so that each terminal learns the destinations of image data
and voice data and can transmit and receive images and voice data.

[0004] In recent years, in response to the demand to extend the topology of video conference
systems, a television conference can also be held between an H.320 terminal and an H.323 terminal by
installing, between an ISDN network and a LAN, a gateway that carries an ISDN interface and a LAN
interface and performs the respective protocol conversions.

[0005] As an example of such a conventional video conference system, a multi-point video conference
system between an H.320 terminal connected to an ISDN network and an H.323 terminal connected to a
LAN is explained with reference to drawing 9.

[0006] H.323 terminal 101 is connected to the LAN, and H.320 terminal 102 is connected to ISDN and,
through gateway unit 103, to the LAN. MCU equipment 104, which manages the connections of the
multi-point terminals holding the television conference, is also connected to the LAN.

[0007] To participate in the video conference system, H.323 terminal 101 sends a connection request
containing information on the local terminal to MCU equipment 104 via the LAN. If MCU equipment 104
permits the connection of H.323 terminal 101, it manages the information on that terminal and
notifies all the currently connected terminals that H.323 terminal 101 has been connected, together
with the information on that terminal. Through this processing, H.323 terminal 101 becomes able to
hold a television conference.
[0008] H.320 terminal 102 likewise sends a connection request containing information on the local
terminal, as H.323 terminal 101 does, but it transmits the request to gateway unit 103 in H.320
format; gateway unit 103 converts the request into H.323 format, and the connection request is
thereby made to MCU equipment 104. Connection authorization processing is performed over the reverse
route, and, as with H.323 terminal 101, all the currently connected terminals are notified that
H.320 terminal 102 has been connected, together with the information on that terminal. Through this
processing, H.320 terminal 102 also becomes able to hold a television conference.

[0009] A video conference system connected in this way is in a form based on ITU-T recommendations,
and the television conference is held by transmitting and receiving H.323 data.

TECHNICAL PROBLEM

[Problem(s) to be Solved by the Invention] In such a video conference system, the television
conference is held in a form based on ITU-T recommendations, so gateway unit 103 and MCU
equipment 104 conforming to those recommendations must be used, and a television conference cannot
be held unless equipment conforming to the ITU-T recommendations is used.
[0011] Moreover, because conformance to the ITU-T recommendations requires following the prescribed
topology, procedures, data formats, etc., the topology is restricted (****), and because the data to
be communicated is large, holding a large-scale television conference over a narrow-band LAN impairs
real-time performance, making it difficult to build an efficient video conference system.
[0012] MPEG-4, MP3, etc. exist as high-compression means for data, and their compression ratio is
far higher than that of the data formats used in conventional video conference systems. If a
television conference could be held with MPEG-4 or MP3 data, then (****) a real-time television
conference could be held even over a narrow-band LAN and a more flexible topology would become
possible, so realization of such a video conference system is desired.

DESCRIPTION OF DRAWINGS



[Brief Description of the Drawings] 

[Drawing 1] A drawing showing the topology of the first video conference system according to this
invention.

[Drawing 2] A block diagram showing the MPEG-4 terminal in the first embodiment.
[Drawing 3] A block diagram showing the server equipment in the first embodiment.
[Drawing 4] A flow chart of the video conference system of MPEG-4 terminals in the first embodiment.

[Drawing 5] A drawing showing the topology of the second video conference system according to this
invention.

[Drawing 6] A block diagram showing the H.323 terminal in the second embodiment.
[Drawing 7] A block diagram showing the transform-processing equipment in the second embodiment.

[Drawing 8] A flow chart of the video conference system of the H.323 terminal and MPEG-4 terminal in
the second embodiment.

[Drawing 9] A drawing showing the topology of a video conference system between a conventional H.320
terminal and H.323 terminal.
[Description of Notations] 

1, 2 MPEG-4 Terminal
3 Server Equipment
4 Image Input Unit
5, 8 Encoding Processing Section
6 MPEG-4 Data Transmitting Processing Section
7 Audio Input Unit
9 MP3 Data Transmitting Processing Section
10, 20 LAN Interface
11 Image Output Unit
12, 15 Decoding Section
13 MPEG-4 Data Reception Section
14 Audio Output Device
16 Speaker Selection Processing Section
17 MP3 Data Reception Section
18 Connection Management Department
19 Control Unit
21 MPEG-4 Data Reception Section
22 MPEG-4 Data Message Distribution Processing Section
23 MP3 Data Reception Section
24 MP3 Data Message Distribution Processing Section
25 Connection Management Department
26 Connection Terminal IP Address Management Department
27 Access Permit List
28 Data Setting Management Department
29 Control Unit
30 Speaker Selection Management Department
31 MPEG-4 Terminal
32 H.323 Terminal
33 Transform-Processing Equipment
34, 41 Image Input Port
35, 42 Voice Input Port
36, 43 Image Output Port
37, 44 Audio Output Port
40, 45 Image Voice-Input/Output Processor
46 H.323 Terminal
47 MPEG-4 Terminal

DRAWINGS

[Drawing 1]
[Drawing 2]
[Drawing 3]
[Drawing 5]
[Drawing 8]